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Abstract. We present a novel catalog-independent method, based on a scale dependent 
approach, to detect anisotropy signatures in the arrival direction distribution of the ultra 
highest energy cosmic rays (UHECR). The method provides a good discrimination power for 
both large and small data sets, even in presence of strong contaminating isotropic background. 
We present some applications to simulated data sets of events corresponding to plausible 
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based observatories. 
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1 Introduction 



In many field involving data analysis, the search for anisotropy has played a crucial role. Many 
estimators, namely correlation functions [1-3], have been proposed and widely used to search 
for clustering of objects and to measure deviation from isotropy of angular distributions. 
These methods apply to angular coordinates of objects as well to distributions of arrival 
directions of events: in this work we will indifferently refer to both as arrival direction 
distributions of events. 

If n is the number of experimental coordinates on some region S of a spherical surface 
and r is the number of coordinates coming from several Monte Carlo realizations of S, 
the common anisotropy analysis involves the computation of different estimators: Peebles 
(or natural), Davis- Peebles, Landy-Szalay and Hamilton [1-3] angular correlation function, 
respectively defined as 
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where DD is the number of pairs lying between O and O + A0 for the experimental dis- 
tribution, RR is the same number calculated for Monte Carlo realizations and DR is the 
cross-pair counts between experimental and simulated events. By definition (see Ref. [4]) 
the function Wj(0) (i = 1, 2, 3, 4) is strictly related to the excess on the area d£l of the pairs 
number diVp a i rs over the random background dNMC'- 

dN paivs - dN M c = -nn u)i(Q)d£l (1.5) 

where no is the expected average density of events on S assuming an isotropic distribution. 

A variant of the two-point angular correlation function is widely used for small data set 
of points [5-9] . It is defined as 

n i—1 
i=\ j=i 

where H is the Heaviside function and 



&ij = arccos (cos 9i cos 6j cos(0j — <pj) + sin 0j sin 9j) (1-7) 

is the angular distance between two directions % and j with coordinates (<f>, 9) on the sphere. 

Recently, new estimators have been introduced to study the anisotropy signature of 
sky's arrival direction distributions: the modified two-point Rayleigh [10], and shape-strength 
method derived from a principle component analysis of triplets of events [11]. Such a test 
statistics have been recently applied to both P. Auger data and to synthetic maps of events, 
the latter generated by sampling the Veron-Cetty & Veron catalog [12] of nearby candidate 
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active galactic nuclei (AGN) within 75 Mpc (z < 0.018), showing a higher discrimination 
power than other estimators [13]. 

Within the present work, we introduce a new fast and simple method for anisotropy 
analysis, which makes use of a multiscale approach and depends on one parameter only, 
namely the angular scale of the instrinsic anisotropy. The main advantage of our estimator 
is the possibility to analytically treat the results: the analytical approach drastically reduces 
computation time and makes available the possibility of applications to very large data sets of 
objects. We test the method on several simulated isotropic and anisotropic arrival direction 
distributions (mock maps) and perform an extensive analysis of its statistical features under 
both the null and the alternative hypotheses. However, it is worth remarking that the scope 
of applicability of our method is not limited to UHECR physics, and it is valid for any 
distribution of angular coordinates of objects. 

2 Multiscale Autocorrelation Function 

Let S be a region of a spherical surface and let Pj ((f>,0) (i = 1,2, ...,n) be a set of points 
locating n arrival directions on S, defining a sky. The sky S is partitioned within a grid of N 
equal-area (and almost-equal shape) disjoint boxes B k (k = 1,2, ...,N) as described in Ref. 
[14]. Let O be the solid angle covered by S, whereas each box Bk covers the solid angle 



where 2G is the apex angle of a cone covering the same solid angle: N, and are deeply 
related quantities that define a scale. 

Let ipk{Q) be the fraction of points in the data set falling into the box Bk'- the function 
A{Q) that quantifies the deviation of data from an isotropic distribution at the scale 0, is 
chosen to be the Kullback-Leibler divergence [15, 16] 



where tjj k (Q), generally a function of the domain meshing, is the expected fraction of points 
isotropically distributed on S falling into the box B k . The Kullback-Leibler divergence is an 
information theoretic measure widely used in hypothesis testing and model selection criteria 
[17-19], statistical mechanics [20-22], quantum mechanics [23-26], medical [27] and ecological 
[28] studies, to cite some of the most known. This measure quantifies the error in selecting 
the fraction ip(@) to approximate the fraction -0(0) and it is strictly connected to maximum 
likelihood estimation (see Appendix A). It is straightforward to show that ^4(0) is minimum 
for an isotropic distribution of points, or, in general, when ^(0) ~ ip(@), i-e. if the model is 
correct. 

If ^4data(0) and Ai so (0) refer, respectively, to the data and to an isotropic realization 
with the same number of events, we define multiscale autocorrelation function (MAF) the 
estimator 
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(a) Static counting (b) Dynamical counting 



Figure 1. A cluster of three points falling into different boxes. 

where (^4; so (0)) and <7A iao (0) are the sample mean and the sample standard deviation, re- 
spectively, estimated from several isotropic realizations of the data. If %q denotes the null 
hypothesis of an underlying isotropic distribution for the data, the chance probability at the 
angular scale G, properly penalized because of the scan on G, is the probability 

p(q) = Pr ( Siso (e') > s data (e)|fto,ve' e v) (2.3) 

obtained from the fraction of null models giving a multiscale autocorrelation, at any angular 
scale 0' in the parameter space V, greater or equal than that of data at the scale 0. The 
null hypothesis is eventually rejected in favor of the alternative "Hi = ~^Hq — being -> the 
negation operator — at the angular scale 0, with probability 1 — p(@). 

Under the null hypothesis %q, the estimator s(0) follows a half-Gaussian distribution, 
independently on the value of the angular scale and on the number of events on S, as it 
will be successively shown in the text. 

3 Dynamical counting 

The simplest definition of the counting algorithm, as shortly described in Section 2, involves 
the fixed grid introduced in Ref. [14], where each box only embodies the relative number 
of events falling in it. Unfortunately, such a static counting approach could not reveal an 
existing cluster. For instance, Figure la shows a typical scenario where some points of a given 
triplet fall into different cells. Indeed, the fixed grid may cut a cluster of points within one 
or more edges, causing a further loss of information at the angular scale under investigation. 
To overcome this possible loss of information, we introduced a type of smoothing of the grid 
by applying it on the data. 

The smoothing, adopted in our study, deals with a new counting procedure for the esti- 
mation of the density ipk(&)- However, such a density depends on the observatory's exposure, 
generally a function on the celestial sphere depending on both the latitude of the experiment 
and the maximum zenith of detection, quantifying the effective time-integrated detection 
area for the flux of particles from each observable sky position. The relative exposure oj is 
the dimensionless function corresponding to the exposure normalized to its maximum value. 
For a single full-time operating detector, i.e. with constant exposure in right ascension, 
fully efficient for particles arriving with zenith angles smaller than fj maX) it depends on the 
declination 5 and it is defined as [29] 

u (5) oc cos 4>q cos S sin a m + a m sin <f>Q sin 5, (3-1) 
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where (j>o is the detector latitude and 



a,. 



' o £ > 1 

vr f < -1 (3.2) 
^ cos" 1 £ otherwise 



with 

cos # max — sin 4>q sin 5 



cos (^o cos (5 



Given an angular scale O, for each point Pj falling inside a box, we consider a set of 8 
new points lying on a virtual box centered on Pj. Let ct^o and <5j o De > respectively, the right 
ascension and the declination of the point Pj. We introduce the following notation: 

<5i,±l = <5i,o =•= y 
«i,±i = «i,o ± fl 1 (8i,o) 

Oii,±2 = CXifl ± 5 (5t,+l) 
Oi,±3 = «i,0 ± 5 (^i,-l) 

where <?(•) is a function obtained from Eq. (1.7), depending on declination, that constrains 
the angular distance between each of the 8 points and the original one to be Within this 
framework, for each Pj(oi 0j <^,o); we have the following 9 extended points: 

• The original point Pj(o!i,o, <^i,o); 

• The up Pi(a i)0 , S ij+ x) and down Pi(a i)0 , Si _i) points; 

• The left P^on _i, ^ >0 ) and right P i (a i) i, 5 i)0 ) points; 

• The up- left Pj(aj _ 2 , and up-right Pi(a iy2 , 8i,+i) points; 

• The down-left P,(Qj 5 _3, 5i,_i) and down-right Pi(oLi t 3,5i-i) points. 

We define the function h(5ij) = w(5ij)/u)(6io) (j = 0,±1) and introduce the weights 

f(Si ' j) = + 3 +W, +1 ) (3 ' 3) 

In other words, we weight the angular region around a given direction with the local 
value of the exposure. Finally, we follow the procedure described in Section 2 by using the 
weighted distribution of points instead of the original one, as shown in Figure lb: thus, the 
density function is defined as the fraction of extended points, opportunely weighted the 
by function f(5ij) defined in Eq. (3.3). Our numerical studies show that such a dynamical 
counting approach recovers the correct information on the amount of clustering in the data. 

In fact, the main difference between the static and the dynamical counting lies in the 
value of the estimator when the procedure is applied to Monte Carlo realizations of the 
sky. For instance, let us consider the Figure 2, where we show a clustered (Figure 2a) and 
an unclustered (Figure 2b) set of points. The static counting is not able to recover the 
differences between the two configurations. Conversely, if the dynamical counting is applied, 
the extended points in Figure 2a are concentrated in two adjacent boxes while in Figure 
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(a) Clustered points (b) Unclustered points 



Figure 2. a) Three clustered points: the extended points are mainly concentrated in two adjacent 
boxes, b) Three unclustered points: the extended points are mainly distributed on the neighbor cells. 

2b they are distributed on the neighbor cells. This fundamental difference is reflected in 
the density function, leading to two different Monte Carlo skies producing the same 

clustered configuration shown in Figure 2a, and of consequence the same weight distribution, 
are not frequently expected: in this case, the value of s{&) should be greater than that one 
estimated from the static method. The direct consequence of a greater value of the estimator 
s(@) is a lower chance probability and the main advantage of using the dynamical counting, 
instead of the static one, should be the lowest penalization of s(Q) only if an anisotropy 
signal is really present. 

In order to illustrate the importance of dynamical counting in the anisotropy signal 
detection, we have generated 5000 isotropic and anisotropic skies of 100 events each. In 
each anisotropic sky, 60% of events are normally distributed, with dispersion p, around 10 
random sources and 40% of events are isotropically distributed. For each angular scale Q, 
we have estimated the average value of s(G) with the static and the dynamical counting, 
separately. Results are shown in Figure 3 for p = 5° (a), p = 10° (b), p = 20° (c) and for 
the isotropic map (d). As expected, the two counting methods do not show differences in 
the estimation of MAF in the case of isotropic skies, resulting in the same flat average value 
of s(G). Conversely, in the case of the anisotropic skies, the dynamical counting provides a 
greater estimation of s(Q) than the static counting, leading to a smaller estimation of the 
corresponding chance probability and improving the signal-to-noise ratio. In the next section 
we will show how the dynamical counting is able to correctly recover the most significant 
clustering scale. For sake of completeness, we have generated all mock maps with a full-sky 
coverage and a uniform exposure. 

4 Interpretation of MAF 

Any catalog-independent method provides information about the angular scale 0* where the 
significance is minimum. In the case of a simple two-point method, such an angular scale is 
quite difficult to interpret and topologically different configurations of events lead to the same 
significance. In the case of the modified two-point Rayleigh method, the estimation of the 
significance includes another set of parameters, independent from the angular distribution, as 
described in Ref. [10]: parameters are sensitive to the orientation of the pairs and therefore 
to skies showing preferential directions and filamentary structure of points. It follows that G* 
is the most significant angular size for the mix of these informations, still linked to the pair 
configuration. In the case of the shape-strength method, the estimation of two parameters, 
namely the shape and strengh, is performed: both can be interpreted, respectively, in terms 
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Angular Scale, (deg) Angular Scale, (deg) Angular Scale, (deg) Angular Scale, (deg) 

Figure 3. MAF: average s(0) (solid line) estimated from 5000 isotropic and anisotropic skies of 
100 events each. In each anisotropic sky, 60% of events are normally distributed, with dispersion p, 
around 10 random sources and 40% of events are isotropically distributed. The dashed line indicates 
the value of the dispersion adopted to generate the corresponding mock map: a) 5, b) 10 and c) 20 
degrees; d) isotropic map. 

^ i rfl in i nP in 
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Angular Scale, (deg) Angular Scale, (deg) Angular Scale, (deg) Angular Scale, (deg) 

Figure 4. MAF: average chance probability (solid line), with 68% region around the mean value, 
estimated from isotropic and anisotropic skies generated as explained in Figure 3. Dynamical counting 
is used. The dashed line indicates the value of the dispersion adopted to generate the corresponding 
mock map: a) 5, b) 10 and c) 20 degrees; d) isotropic map. 




of size and elongation of the triangles denned by a triplet of points. It follows that all 
information is recovered from the configuration of triplets. 

In the specific case of MAF, the angular scale 0*, where the significance is minimum, 
turns to be the significative clustering scale: it is the scale at which occurs a greater accu- 
mulation of points respect to that one occuring by chance, with no regard for a particular 
configuration of points, e.g. doublets or triplets. To illustrate better the clustering scale de- 
tection feature of MAF, we have generated 5000 isotropic and anisotropic skies of 100 events 
each, as previously described at the end of Section 3. Figure 4 shows the average chance 
probability, with 68% region around the mean value, versus the angular scale for three values 
of the dispersion, namely p = 5° (a), p = 10° (b), p = 20° (c), and for the isotropic map (d). 
As expected, chance probability is close to one and nearly flat in the case of the isotropic 
map, because all clustering scales are equally likely. Conversely, for all anisotropic maps, the 
average chance probability gets a minimum around the corresponding value of p. Thus, our 
estimator is able to recover the most significant clustering scale. It should be remarked that 
when the 20° dispersion is used, the angular scale of the minimum is less obvious because of 
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the large fluctuations due to the isotropic contamination. Finally, it is worth noticing that we 
have observed that the curve around the value of p gets narrower by increasing the number 
of events. 



5 Statistical analysis of MAF 

In this section, we investigate the statistical features of MAF by inspecting its behavior under 
both the null or the alternative hypothesis. In particular, we estimate the significance a (or 
Type I error), i.e. the probability to wrongly reject the null hypothesis when it is actually 
true, and the power 1 — f3 (where f3 is known as Type II error), i.e. the probability to accept 
the alternative hypothesis when it is in fact true. In the following we will adopt the dynamical 
counting previously discussed. 

Null hypothesis. We generate isotropic maps of 10 5 skies, by varying the number of 
events from 20 to 500: for each sky in each map, we estimate the MAF for several values of 
the angular scale G. Hence, we choose the value of = 0* where the chance probability is 
minimum, as the most significant clustering scale: 



properly penalized because of the scan on the parameter 0, according to the definition in 
Eq. (2.3). Indipendently on the number of events in the mock map, we find an excellent flat 
distribution of probabilities p(0*), shown in Figure 6a for skies of different size, as expected 
for analyses under the null hypothesis %q. In other words, MAF is not biased against Hq, as 
required for good statistical estimators. 

Because of the definition in Eq. (2.1) and of the central limit theorem, a Gaussian dis- 
tribution is expected for the function A(Q), and of consequence, the half-normal distribution 



for cr(0) = 1, is expected for the estimator s(0) defined as in Eq. (2.2), being normalized to 
zero mean and unitary variance. In Figure 5 are shown the distributions of the MAF estimator 
for n = 40 and n = 100 events, for angular scales ranging from 2° to 26°, separately. We 
find an excellent agreement between the distribution for Monte Carlo realizations and the 
expected one. It follows that the (unpenalized) probability to obtain by chance a value of 

the MAF, greater or equal than a given value sq, is just 1 — erf (^^j > being erf the standard 
error function, independently of the angular scale 0. 

Although this important feature of the MAF estimator, generally the distribution of 
s max = max{s(0)} is of interest for applications, because of the required penalization due 
to the scan over the parameter 0. Hence, it is important to identify the distribution of 
the penalized probability p(0), if any. Intriguingly, our numerical studies show that such a 
distribution exists and it corresponds to one of the limiting densities in the extreme value 
theory (see Appendix B). In particular, the cumulative density of maxima is known as the 
Gumbel distribution [30, 31]: 



p(@*) = argminp(0) 



V2^o-{0) 



2 



s 2 (e) 
e 2o- 2 (e) 



(5.1) 



G(x) = exp — exp 
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MAF, s(0) MAF, s(0) 



Figure 5. Distribution of the MAF estimator for n — 40 and n — 100 events, for angular scales 8 
ranging from 2° to 26°, separately. Solids lines correspond to the expected half- normal distribution. 




Minimum p-value Maximum s(0) 



Figure 6. MAF. a) Distribution of p(Q*) for n = 40,60,80,100 and 500 events, b) Distribution of 
max{s(8)} for n = 40, 60, 80, 100 and 500 events. Solid line correspond to the least-square fit of the 
Gumbel density with parameters \i = 1.743 ± 0.002 and a = 0.470 ± 0.002 (% 2 / ndl = 1-1 x iO -5 )- 



where [i and a are the location and shape parameters, respectively, and the corresponding 
probability density is 

g (x) = - exp 
a 

In Figure 6b are shown the probability densities of s max for n = 40, 60, 80, 100 and 
500 events: independently on n, each density is in excellent agreement with the Gumbel 
distribution of extreme values, for the parameters fi = 1.743 ± 0.002 and a = 0.470 ± 0.002. 
Such values correspond to the mean and to the standard deviation of the distribution, p, ~ 
2.00 and a ~ 0.59, respectively (see Appendix B). It follow that the probability to obtain 
a maximum value of s(0), at any angular scale O, greater or equal than a given value 



exp 



x — fl 
a 



(5.2) 
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max{s(0)} is 

'max{s(0)} — fj, 



p (max{s(0)}) = 1 — exp 



exp 



a 

providing an analytical expression for the penalized probability denned in Eq. (2.3). 

Alternative hypothesis. In order to investigate the behavior of MAF under the alter- 
native hypothesis of an underlying anisotropic distribution of objects, we have generated 
anisotropic maps of 10 skies, by varying the number of events from 20 to 100. In general, 
the anisotropy of a sky depends on several factors: for instance, in the case of cosmic rays, 
it depends on the distribution of sources, on magnetic fields and on propagation effects as 
energy loss or the GZK cutoff [32, 33] (and Ref. therein). Thus, a more complicated approach 
is required for the Monte Carlo realization of the maps. In order to estimate the power of 
MAF, we build reasonable anisotropic maps reflecting in part the real-world scenario, keeping 
in mind that our purpose is to build an anisotropic set of events for statistical analysis and 
not to generate events mimicking real data sets with the best available approximation. We 
proceed as follows: 

1. Catalog of candidate sources. Although several models for production mechanisms of 
UHECR are available [32, 33] (and Ref. therein), [34-41], it is generally accepted that 
the candidate sources are extragalactic and trace the distribution of luminous matter 
on large scales [42]. In particular, it has been shown that correlation with possible high 
redshift sources is unlikely [43], whereas compact sources are favored [44, 45]: the recent 
result reported by the P. Auger Collaboration experimentally supports the latter claim, 
showing a high correlation between the observed data and the distribution of nearby 
active galactic nuclei (AGN) [46, 47]. For these reasons, we use the Palermo Swift-BAT 
hard X-ray catalogue of AGN with known redshift within 200 Mpc (z < 0.047) [48], 
as the reference catalog of candidate sources providing the most complete and uniform 
all-sky hard X-ray survey up to date. 

2. Source effects. Events, from each source in the reference catalog, are generated by 
weighting for the source flux and for the expected geometrical flux attenuation. Hence, 
the number of events coming from a source is proportional to its flux and to the fac- 
tor z~ 2 : because of the small scales and the high energy of cosmic rays involved in 
anisotropy studies [E > 4.0 x 10 19 EeV), we assume a flat universe with zero cosmo- 
logical constant (VL = 1, A = 0) and nonevolving source. Indeed, we naively take into 
account the possible deflections of the particles, due to the random component of the 
magnetic field, by producing arrival directions gaussianly-distributed with dispersion 
p around the source. It is worth remarking that such a dispersion is strictly related 
to both the injection energy and the mass of the particle, as well as other physical 
quantities [32]. 

3. Background. We take into account the possibility for a contaminating isotropic back- 
ground of the anisotropy signal, by generating a number of events isotropically dis- 
tributed, corresponding to a fraction /i SO of the whole data set. 

4. Detection effects. As previously explained, the number of events detected by a single 
fully efficient and full-time operating surface detector, depends on its own relative 
exposure. In order to take into account such a detection effect, we generate the events 



- 10 - 



Experiment 


00 


$max 


Exp. (m 2 s sr) 


A 


#Ev. 


Volcano R. 


35.15°N 


70° 


0.2 x 10 ie 


1.000 


6 


Yakutsk 


61.60°N 


60° 


1.8 x 10 16 


0.625 


20 


H. Park 


53.97°N 


74° 




1.000 


7 


AGASA 


35.78°N 


45° 


4.0 x 10 16 


0.750 


29 


SUGAR 


30.43° S 


70° 


5.3 x 10 16 


0.500 


13 


P. Auger 


35.20°S 


60° 


28.4 x 10 16 


1.200 


27 



Table 1. Surface detectors: positions, maximum zenith angles # ma x, exposures, energy shift factors 
and number of detected events with rescaled energy E' > 4.0 x 10 19 EeV. 

according to the relative exposure of the single detector. Moreover, for each detector we 
generate the corresponding number of events reported in Table 1, in order to produce 
skies mimicking as much as possible real data currently available. 

However, for a more realistic distribution of events, several more constraints, in general 
based on further assumptions or debated models, are required: the mass of the particle, 
the injection spectrum of the source, the intervening magnetic field, to cite some of the 
most important. In our study of the MAF discrimination power, we fix p = 3°, as the 
mean angular deviation of UHECR in the galactic and extra- galactic magnetic field, and a 
background fraction /; so = 0.3. 

In order to produce a likely map of UHECR, we choose to generate events distributed 
in the whole sky, according to the number of events collected by surface detectors in the last 
decades. In particular, we consider events with energy E > 4.0 x 10 19 EeV and error on 
the arrival direction smaller than 5°, as detected at the Sidney University Giant Airshower 
Recorder (SUGAR) [49], Akeno Giant Air Shower Array (AGASA) [50], Haverah Park [51], 
Volcano Ranch (one event from [32] and six events from [52]), Yakutsk [53], P. Auger Ob- 
servatory [47]. However, the fluxes of particles as measured by those experiments do not 
agree each other in the absolute fluxes, and a rescaling is needed [54]. By assuming that the 
spectrum reported by the HiRes Collaboration [55] corresponds to the correct energy scale, 
the rescaling, based on some specific characteristics of the UHECR spectrum, fixes the en- 
ergy shift factors A for the other experiments [9, 54]. Positions, maximum zenith angles ^ max , 
exposures and energy shift factors are reported in Table 1, for each experiment, as well as the 
number of detected events with rescaled energy E' > 4.0 x 10 19 EeV (E' = XE). In Figure 
7 is shown the relative geometrical exposure of each single detector listed in Table 1, as well 
as the joint exposure of all experiments. For reference, in Figure 8a is shown the all-sky data 
set of 102 detected events with rescaled energy E' , superimposed on the distribution of AGN 
within 200 Mpc from the reference catalog, whereas in Figure 8b is shown the mock map of 
simulated events according to physical constraints previously described. 

In Figure 9 we show the power 1 — /3 vs. the number of events, generated as described 
above. A sky is labelled as anisotropic if, for a fixed value of the significance a, the penalized 
chance probability as defined in Eq. (2.3) is lesser or equal than a, i.e. if the condition 

p(&*) = argminp(O) < a 

holds for some angular scale 0*. In Figure 9 is shown the power for two values of the 
significance threshold, namely a = 0.1% and a = 1%, estimated through the analytical 
approach. For applications, a power of 90% is generally required: under this threshold the 
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-80 -60 -40 -20 20 40 60 80 

Declination, 5 (deg) 

Figure 7. Relative geometrical exposure of each single detector listed in Table 1 (lines and points), 
and the joint exposure of all experiments (solid line). 




(a) UHECR events and candidate sources. (b) Mock 



Figure 8. a) All-sky data set of 102 detected events with rescaled energy E 1 > 40 EeV (see the text 
for further information) superimposed on the distribution of AGN with known redshift (z < 0.047) 
from the Palermo SWIFT-BAT hard X-ray catalogue; b) corresponding mock map generated for the 
statistical analysis (see details in the text). Equatorial coordinates are shown. 

method could miss to detect an existing anisotropy signal. In the case of the MAF, and for 
the considered anisotropic mock map, the power increases with the number of events n and 
it is able to detect the anisotropic signal for n > 60, with significance a = 1%. However, 
by decreasing the significance for the statistical test, the power requires a greater number of 
events to reach the 90% threshold, as expected: our test clearly shows that the MAF provides 
an excellent discrimination power for n > 80. Indeed, we verified the agreement between the 
analytical and the Monte Carlo estimations of the discrimination power. 

6 Discussion and conclusion 

We introduced a new statistical test, based on a multiscale approach, for detecting an 
anisotropy signal in the arrival direction distribution of UHECR, that makes use of an in- 
formation theoretical measure of similarity, namely the Kullback-Leibler divergence, and of 
the extreme value theory. Within the present work we showed that our procedure is suitable 
for the analysis of both small and large data sets of events, by applying it on several Monte 
Carlo realizations of isotropic and anisotropic synthetic data sets, corresponding to plausible 
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Figure 9. MAF power vs. the number of events sampled from anisotropic mock maps generated as 
described in the text, for values of the significance corresponding to a = 0.1% and a = 1%. 



scenarios in the physics of highest energy cosmic rays. In fact, for small data sets as well 
as for larger ones, the method is able to recover the information about the most significant 
angular scale of clustering in the data, even in presence of strong isotropic contamination. 

The advantages of our approach over other methods are multiples. First, the method 
allows an analytical description of quantities involved in the estimation of the amount of 
anisotropy signal in the data, avoiding thousands of Monte Carlo realizations needed for the 
penalizing procedure of results and drastically reducing the computation time. Second, the 
method allows the detection of a physical observable, namely the clustering scale, in the 
case of a point source. In the case of multiple sources, the information is about the most 
significant clustering scale(s), according to source distribution. Third, the method is unbiased 
against the null hypothesis and it provides a high discrimination power even in presence of 
strong contaminating isotropic background, for both small and large data sets. Although 
in this work we referred to UHECR physics for our applications, it is worth remarking that 
the method is suitable for the detection of the anisotropy signal in each data set involving 
a distribution of angular coordinates on the sphere, and it can be adapted to non-spherical 
spaces by properly redefining the dynamical counting algorithm. 
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A The Kullback-Leibler divergence 

Let P and Q be two probability distributions, with densities p{x) and q(x), respectively. The 
Kullback-Leibler (KL) divergence is a measure quantifying the error in approximating the 
density p(x) by means of q(x), and it is defined [15, 16] as 



T>KLip\\q) = J p(x)\og^dx (A.l) 



The KL divergence is non-negative, i.e. T>KL(p\\q) > with equality if and only if P = Q, 
and asymmetric, i.e. T>kl(p\\q) 7^ ^KL(q\\p)- The statistical interpretation of KL divergence 
is as follows. 

Let P the empirical distribution of random outcomes X{ (i = l,2,...,n) of the true 
distribution P, putting the probability - on each outcome as 



n 

n 



p(x) = - Six - xA (A.2) 

T) L- ^ 



n 
i=l 



and let Q® be the statistical model for the data, depending on the unknown parameter 0. 
It follows 

T>K L (p\\qe) = -n{p) - f p(x) log q(x\e)dx (A.3) 

where H{p) is the information entropy of p, not depending on 0, whereas p and q® = q(x\&) 
are the corresponding densities of P and Q®, respectively. Putting Eq. (A.2) in the right- 
hand side of Eq. (A.3): 

1 n 

-DKdpWm) = -nip) - - Viog^l©) 

n ^-^ 

i=i 

= -H(p) - ^C q (Q\x) (A.4) 
where C q (Q\x) is the log- likelihood of the statistical model. It directly follows that 

argmin2?Kx(p||<Ze) = — arg max£„(0|x) (A. 5) 

e n e 

where the function argmin(argmax)/(0) retrieves the minimum (maximum) of the function 
/(©)■ Hence, another way to obtain the maximum likelihood estimation it to minimize 
the KL divergence [56]; indeed, it can be shown that the KL divergence corresponds to the 
expected log-likelihood ratio [57]. 

B The Gumbel distribution 

Extreme value theory is the research area dealing with the statistical analysis of the extremal 
values of a stochastic variable. Let X{ (i = 1, 2, n) be i.i.d. random outcomes of a distri- 
bution F. If M n = maxjxi, X2, x n }, the probability to obtain an outcome greater or equal 
than M n is: 

Pr(M„ <x)= Pr(xi < x, x 2 < x, x n < x) = F n (x) 
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It can be shown that the limiting distribution F n (x) is degenerate and should be normalized 
[58]. However, if there exists sequences of real constants a n > and b n such that 



Pr 



M n - b n 



< x ) = F n (a n x + b n 



then 



lim F n (a n x + b n ) = G(x) (B.l) 

n — >oo 

The function G(x) is the generalized extreme value (GEV) or Fisher-Tippett distribution 



G(z) 



exp (— e z ) 



exp 



x — fl 



(B.2) 



defined for 1 — £z > if £ ^ and for z € K if £ = 0, where fx, a and £ are the location, scale 
and shape parameters, respectively. The Gumbel distribution is related to the distribution 
of maxima [30, 31] and it is retrieved for £ = [58]. The corresponding probability density 
g(x) is easily obtained from G as 



g{x) = - exp 
a 



x — {JL 



exp 



x — fl 
a 



(B.3) 



Finally, the two parameters (i and a can be related to the mean ft, and to the standard 
deviation a of the distribution, by means of the following relations: 



ft = fi + 7<7 



a 2 



7l 



-a 



(B.4) 
(B.5) 



where 7 = 0.577215... is the Euler constant. 
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